Abstract In this paper, we apply a new statistical analysis technique, the Mean Field approach
to Bayesian Independent Component Analysis (MF-ICA), to galaxy spectral analysis.
This algorithm can compress the stellar spectral library into a few Independent
Components (ICs), and a galaxy spectrum can be reconstructed from these ICs. Compared to
other algorithms, which decompose a galaxy spectrum into a combination of several simple
stellar populations, the MF-ICA approach offers a large improvement in efficiency. To
check the reliability of this spectral analysis method, three different tests are used: (1)
parameter recovery for simulated galaxies, (2) comparison with parameters estimated by
other methods, and (3) a consistency test of parameters derived for Sloan Digital Sky Survey
galaxies. We find that our MF-ICA method not only fits the observed galaxy spectra
efficiently, but also recovers the physical parameters of galaxies accurately. We also apply
our spectral analysis method to the DEEP2 spectroscopic data, and find that it provides
excellent fits to those low signal-to-noise spectra.

Key words: methods: data analysis - methods: statistical - galaxies: evolution - galaxies: 
fundamental parameters - galaxies: stellar content 


1 INTRODUCTION 


The spectrum of a galaxy contains plentiful information about its properties (Kong et al. 2014). Finding a
way to analyze the spectra of observed galaxies and determine the parameters of large samples of galaxies
would not only help us to investigate galaxy formation and evolution, but also allow us to derive
cosmological information from a large number of galaxies (Conroy 2013). Many methods, distinguished by
the features they use, have been devised to measure and understand the physical parameters of galaxies, whether
by using spectral indices (Worthey et al. 1994) or emission features (Kewley et al. 2001; Shi et al. 2014),
or by fitting the full spectrum (Tremonti et al. 2004; Cid Fernandes et al. 2005; Ocvirk et al. 2006;
Tojeiro et al. 2007; Liu et al. 2013). Owing to the abundance of high-quality galaxy spectra, two different
population synthesis approaches have been commonly used to study the stellar content of
galaxies. The empirical population synthesis method (Faber 1972; Bica 1988; Cid Fernandes et al. 2001;
Kong et al. 2003) models galaxies as a mixture of several observed spectra of stars or
star clusters. However, this method does not consider stellar evolution and is limited by the observed
stellar/cluster spectral library. More recently, a more direct approach, so-called evolutionary population
synthesis (Vazdekis 1999; Girardi et al. 2000; Bruzual & Charlot 2003, hereafter BC03; Maraston 2005;
Chen et al. 2015), has been widely used. In this approach, the spectra of stellar populations are modeled
by combining stellar evolution tracks, an SSP library and star formation histories (SFHs). A popular
simple stellar population (SSP) library is the one provided by the isochrone synthesis technique
(BC03). Several groups have selected a few SSPs from this library as templates to fit the observed galaxy
spectra (Tremonti et al. 2004; Cid Fernandes et al. 2005).


However, with the advent of large-area spectroscopic surveys, such as the Third Sloan Digital Sky
Survey (SDSS-III; Eisenstein et al. 2011), the Deep Extragalactic Evolutionary Probe 3 (DEEP3)
Galaxy Redshift Survey (Cooper et al. 2011), and the Large sky Area Multi-Object fiber Spectroscopic
Telescope (LAMOST; Cui et al. 2012), which will provide oceans of data, fast
and automated extraction methods are required. Statistical analysis techniques have been
commonly implemented for this purpose. For example, Richards et al. (2009) used the diffusion k-means method to
draw several prototype spectra from an SSP database as input templates for the spectral synthesis software
STARLIGHT (Cid Fernandes et al. 2005). Nolan et al. (2006) applied a data-driven Bayesian approach
to the spectra of early-type galaxies. Another blind source separation (BSS) technique applied to spectra
is principal component analysis (PCA; Mittaz, Penston & Snijders 1990; Kong & Cheng 2001; Yip et al.
2004), but the individual component spectra are rarely easy to interpret. Here, we
explore a new statistical multivariate data processing technique, independent component analysis (ICA),
in our spectral analysis. This technique has been implemented in Cosmic Microwave Background
studies (Maino et al. 2002) and in the analysis of spectra (Lu et al. 2006; Allen et al. 2013). However, the
Ensemble Learning ICA (EL-ICA, also known as naive mean field or NMF) method used in Lu et al.
(2006) is known to fail in some circumstances, e.g. for low signal-to-noise spectra (Højen-Sørensen et al.
2001), and Allen et al. (2013) applied ICA only to emission-line galaxies. Because the quantities in
galaxy spectral analysis must be non-negative, we adopt a new ICA algorithm, mean field approach independent
component analysis (MF-ICA), which can constrain the sources and the mixing matrix to be non-negative
with a more efficient and more reliable algorithm.


The paper is structured as follows. In Section 2, we introduce the MF-ICA method, and derive a
few templates from the evolutionary population models of Charlot & Bruzual (2007) to analyze galaxy
spectra. In Section 3, simulated galaxy spectra are used to assess the reliability of the MF-ICA
method. In Section 4, we analyze observed SDSS galaxy spectra and compare our results with those
obtained from the MPA/JHU$^{1}$ catalogs, to investigate whether our synthesis results are reasonable. In
Section 5, some galaxy spectra from the DEEP2 galaxy redshift survey are fitted with our method, and
our conclusions are outlined in Section 6.


2 METHOD 


2.1 Stellar Population Models 


Stellar population models can be generated by several population synthesis codes. Here we adopt the
2007 version of GALAXEV$^{2}$ (Charlot & Bruzual 2007, hereafter CB07), which is a new version of BC03.
The CB07 models include a major improvement: the new stellar evolution prescriptions
of Marigo & Girardi (2007) for the Thermally-Pulsing Asymptotic Giant Branch (TP-AGB)
evolutionary phase of low- and intermediate-mass stars. Accurate modeling of this phase is needed
to predict correct fluxes in the wavelength range 1–2.5 μm (Charlot & Bruzual 2007).


The CB07 models use an empirical spectral library covering the wavelength range 91 Å–36000 μm
(N = 6917 wavelength points) at a spectral resolution of about 3 Å. Moreover, CB07 contains a large sample of SSPs, which
covers 221 different ages from $1.0 \times 10^{5}$ to $2.0 \times 10^{10}$ yr, and a wide range of initial chemical
compositions, Z = 0.0001, 0.0004, 0.004, 0.008, 0.02, 0.05 and 0.1 ($Z_{\odot} = 0.02$). The observed spectrum of
a galaxy can be expressed as a weighted combination of these individual SSPs. This SSP database
is used to derive our templates in Section 2.2.3.


$^{1}$ http://www.mpa-garching.mpg.de/SDSS/
$^{2}$ http://www.bruzual.org
2.2 MF-ICA technique

2.2.1 Independent Component Analysis (ICA)
ICA is a multivariate data processing method which aims at decomposing complex multivariate
observations into a combination of a few hidden original sources (Hyvärinen et al. 2001). Compared
to traditional multivariate data processing methods, such as principal component analysis (PCA)
or factor analysis, ICA is much more powerful for finding the hidden sources, and can succeed even when those traditional
methods fail completely. The following generative model of ICA expresses the multivariate observations,
or mixed signals, $x^{i}$, $i = 1, 2, \ldots, m$, as a combination of hidden sources, i.e. Independent Components,
$h_{k}$, $k = 1, 2, \ldots, n$, weighted by the mixing weights $w^{i}_{k}$ ($m \times n$), with additive Gaussian noise $\Gamma^{i}$:

$$x^{i} = \sum_{k=1}^{n} w^{i}_{k} h_{k} + \Gamma^{i} \quad (i = 1, 2, \ldots, m). \qquad (1)$$

In our analysis, we take the multivariate observations to be the spectra of stellar systems (e.g. the SSP
database), and assume that each spectrum $f^{i}(\lambda)$ can be expressed as a sum of several
Independent Components (ICs), $\mathrm{IC}_{k}(\lambda)$, so the model can be written as:

$$f^{i}(\lambda) = \sum_{k=1}^{n} w^{i}_{k}\, \mathrm{IC}_{k}(\lambda) + \Gamma^{i}(\lambda). \qquad (2)$$


Here, only the spectra $f^{i}(\lambda)$ are known; the unknown mixing weights $w^{i}_{k}$, the independent components
$\mathrm{IC}_{k}(\lambda)$ and the noise can be estimated with ICA algorithms, such as Joint Approximate Diagonalization
of Eigenmatrices (JADE; Cardoso & Souloumiac 1993), extended InfoMax (Bell & Sejnowski 1995),
FastICA (Hyvärinen et al. 2001), Ensemble Learning ICA (EL-ICA; Miskin & MacKay 2001), Mean
Field ICA (MF-ICA; Højen-Sørensen et al. 2002) and many others.
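
The generative model of Eq. (2) can be illustrated with a small forward simulation. The sketch below only builds mixed spectra from given components and weights; it does not perform ICA. The function name `mix_spectra` and the toy values are hypothetical and chosen purely for illustration.

```python
import random

def mix_spectra(weights, components, noise_sigma=0.0, rng=None):
    """Build mixed spectra following Eq. (2): f^i = sum_k w^i_k IC_k + noise."""
    rng = rng or random.Random(0)
    n_pix = len(components[0])
    mixed = []
    for row in weights:                      # one row of weights per mixed spectrum i
        spectrum = []
        for j in range(n_pix):               # loop over wavelength points
            value = sum(w * ic[j] for w, ic in zip(row, components))
            spectrum.append(value + rng.gauss(0.0, noise_sigma))
        mixed.append(spectrum)
    return mixed

# Two toy non-negative components sampled at four wavelength points (illustrative values).
components = [[1.0, 2.0, 3.0, 4.0],
              [4.0, 3.0, 2.0, 1.0]]
weights = [[1.0, 0.0],
           [0.5, 0.5],
           [0.2, 0.8]]
mixed = mix_spectra(weights, components)     # noise_sigma=0 gives the noiseless model
```

An ICA algorithm addresses the inverse problem: given only `mixed`, it estimates both `weights` and `components`.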


2.2.2 Mean Field approach ICA (MF-ICA) 

The ICA algorithm we adopt for our spectral analysis is the MF-ICA method. Unlike other algorithms,
MF-ICA is a Bayesian iterative algorithm which can constrain the sources and the mixing matrix to be positive
by placing priors on them. The main advantages of the MF-ICA algorithm are its simplicity of implementation
and its generality.

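In this approach, the likelihood for the parameters and sources, $P(\mathbf{X}|\mathbf{W}, \mathbf{\Sigma}, \mathbf{H})$, is given
by:

$$P(\mathbf{X}|\mathbf{W}, \mathbf{\Sigma}, \mathbf{H}) = (\det 2\pi\mathbf{\Sigma})^{-n/2} \exp\left\{ -\frac{1}{2} \mathrm{Tr}\left[ (\mathbf{X} - \mathbf{W}\mathbf{H})^{T} \mathbf{\Sigma}^{-1} (\mathbf{X} - \mathbf{W}\mathbf{H}) \right] \right\}, \qquad (3)$$

where $\mathbf{W}$ is the mixing matrix, $\mathbf{X} = [x^{1}, \ldots, x^{m}]$ is the mixed-signal matrix, $\mathbf{\Sigma}$ is the noise covariance matrix,
$n$ is the number of input source signals, and det denotes the determinant of a matrix. The likelihood of the
parameters, $P(\mathbf{X}|\mathbf{W}, \mathbf{\Sigma})$, is obtained from:

$$P(\mathbf{X}|\mathbf{W}, \mathbf{\Sigma}) = \int d\mathbf{H}\, P(\mathbf{X}|\mathbf{W}, \mathbf{\Sigma}, \mathbf{H}) P(\mathbf{H}). \qquad (4)$$

If priors on the mixing weights $P(\mathbf{W})$ and the sources $P(\mathbf{H})$ are taken into account, then the
posteriors of the sources and the mixing matrix are obtained from $P(\mathbf{H}|\mathbf{X}, \mathbf{W}, \mathbf{\Sigma}) \propto P(\mathbf{X}|\mathbf{W}, \mathbf{\Sigma}, \mathbf{H})P(\mathbf{H})$ and
$P(\mathbf{W}|\mathbf{X}, \mathbf{\Sigma}) \propto P(\mathbf{X}|\mathbf{W}, \mathbf{\Sigma})P(\mathbf{W})$, respectively. In the MF-ICA method, the noise covariance $\mathbf{\Sigma}$ and the
mixing matrix $\mathbf{W}$ are obtained by maximum a posteriori estimation, while the sources $\mathbf{H}$ are obtained from their
posterior mean. The mean field solution is given by:

$$\mathbf{H} = \langle \mathbf{H} \rangle, \qquad (5)$$

$$\mathbf{W} = \mathbf{X} \langle \mathbf{H}^{T} \rangle \langle \mathbf{H}\mathbf{H}^{T} \rangle^{-1}, \qquad (6)$$

$$\mathbf{\Sigma} = \frac{1}{n} \left\langle (\mathbf{X} - \mathbf{W}\mathbf{H})(\mathbf{X} - \mathbf{W}\mathbf{H})^{T} \right\rangle, \qquad (7)$$

where $\langle \cdot \rangle = \langle \cdot \rangle_{\mathbf{H}|\mathbf{W}, \mathbf{\Sigma}, \mathbf{X}}$ denotes the posterior average with respect to the sources given the mixing
matrix and noise covariance. The MF-ICA algorithm alternates between updating the noise covariance (Eq.
7) and the mixing matrix (Eq. 6) and estimating the sources (Eq. 5). The optimized
mixing matrix $\mathbf{W}$, noise covariance $\mathbf{\Sigma}$ and sources $\mathbf{H}$ are thus derived iteratively.
More details about the MF-ICA method can be found in Højen-Sørensen et al. (2002) and in the available
Matlab toolbox (http://mole.imm.dtu.dk/toolbox/ica).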
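
As a brief aside, the form of the mixing matrix update (Eq. 6) can be motivated with a short calculation. This is only a sketch: it assumes Gaussian noise with covariance matrix $\mathbf{\Sigma}$ and a flat prior on $\mathbf{W}$, so it ignores the non-negativity constraint that MF-ICA actually imposes, and it is not taken from Højen-Sørensen et al. (2002). The symbol $\mathcal{L}$ is introduced here for the posterior-averaged log-likelihood.

```latex
\begin{align}
\mathcal{L}(\mathbf{W}) &= -\tfrac{1}{2}\left\langle \mathrm{Tr}\left[(\mathbf{X} - \mathbf{W}\mathbf{H})^{T}\mathbf{\Sigma}^{-1}(\mathbf{X} - \mathbf{W}\mathbf{H})\right]\right\rangle + \text{const}, \\
\frac{\partial \mathcal{L}}{\partial \mathbf{W}} &= \mathbf{\Sigma}^{-1}\left( \mathbf{X}\langle \mathbf{H}^{T}\rangle - \mathbf{W}\langle \mathbf{H}\mathbf{H}^{T}\rangle \right) = 0
\;\Longrightarrow\; \mathbf{W} = \mathbf{X}\langle \mathbf{H}^{T}\rangle\langle \mathbf{H}\mathbf{H}^{T}\rangle^{-1}.
\end{align}
```

Under these assumptions the update is a least-squares solution in which the unknown sources are replaced by their posterior moments.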

Through the Bayesian inference of the mixing matrix and sources, their priors can be constrained
to be non-negative, which is useful in processing observed galaxy spectra, since the spectral
parameters should not be negative. Although the EL-ICA method has been implemented in galaxy spectral
analysis (Lu et al. 2006), here we adopt the MF-ICA method, which relies on advanced mean field
approaches: linear response theory and an adaptive version of the mean field approach. Højen-Sørensen et al.
(2001, 2002) investigated both the MF-ICA and EL-ICA methods, and concluded that, compared to
the EL-ICA method, the advanced mean field approaches can recover the correct sources even when
ensemble learning fails, and that the convergence rate of the MF-ICA method is faster. A
comparison of these two ICA methods is described in Section 3.3.


2.2.3 SSPs Spectral Analysis using MF-ICA 

Through this multivariate data processing technique, we expect to derive a minimal number of non-negative
templates which can represent the spectra of galaxies with minimal loss of information. Here, we
adopt the MF-ICA algorithm to compress the SSP spectral library of the CB07 models (Section 2.1).

The SSP database of CB07 contains 1547 spectra (Section 2.1). Each spectrum was first truncated to
the high-resolution wavelength range of 3322–9200 Å, to match that of the SDSS spectrograph. In the EL-ICA
method, the number of sources (i.e. ICs) must be the same as the number of mixed signals. Therefore, Lu
et al. (2006) picked a subsample of the BC03 SSP database as the mixed-signal matrix X in the
EL-ICA method, and estimated 74 hidden spectra. Finally, they chose several ICs from these hidden
spectra according to their average fractional contribution to the BC03 SSP database. The MF-ICA method
we apply, however, can perform dimensionality reduction. Here the whole CB07 SSP database is set as the input
mixed-signal matrix X to which the MF-ICA method is applied, so the output ICs are expected to be
more precise. To avoid negative values appearing in the spectral analysis, we set the priors of the mixing matrix
and sources to be positive.

As mentioned above, the number of ICs can be smaller than the number of mixed signals in the MF-ICA
method, so it must be specified in advance. We apply the Root-Sum-Square (RSS) method to select
the proper number of ICs. The RSS between the original
mixed signals (i.e. the whole SSP database) and the recovered mixed matrix is calculated by:

$$\mathrm{RSS} = \left[ \sum_{i=1}^{m} \sum_{j=1}^{N} \left( X_{ij} - \hat{X}_{ij} \right)^{2} \right]^{1/2}, \qquad (8)$$

where the recovered mixed matrix $\hat{\mathbf{X}}$ is calculated from the estimated mixing matrix and sources, $\hat{\mathbf{X}} =
\mathbf{W}\mathbf{H}$, and the sums run over the $m$ mixed signals and the $N$ wavelength points. We start with one source and then increase the number of sources, which
reduces the RSS. This process is repeated until the reduction is no longer significant. In this way, the number
of ICs is set to 12.
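
The selection procedure can be sketched in a few lines. The example below computes the RSS of Eq. (8) and applies a simple stopping rule; the fractional threshold `min_fractional_gain` is a hypothetical choice, since the text does not quantify when the reduction is "no longer significant". The RSS values for each trial number of sources would come from separate MF-ICA runs, which are not shown.

```python
import math

def root_sum_square(original, recovered):
    """Eq. (8): square root of the summed squared residuals over all entries."""
    total = 0.0
    for row_x, row_r in zip(original, recovered):
        for x, r in zip(row_x, row_r):
            total += (x - r) ** 2
    return math.sqrt(total)

def choose_n_components(rss_by_n, min_fractional_gain=0.05):
    """Return the smallest n after which adding a source lowers RSS by less
    than min_fractional_gain (a hypothetical threshold, not from the paper).

    rss_by_n[0] is the RSS with one source, rss_by_n[1] with two, and so on.
    """
    for idx in range(len(rss_by_n) - 1):
        current, following = rss_by_n[idx], rss_by_n[idx + 1]
        if current == 0.0 or (current - following) / current < min_fractional_gain:
            return idx + 1
    return len(rss_by_n)
```

For instance, with illustrative RSS values `[10.0, 5.0, 2.0, 1.95, 1.94]` the rule stops at three sources, because a fourth source improves the RSS by only 2.5 per cent.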

Using the number of ICs determined above, the sources can be obtained from the MF-ICA calculation.
The SSP database can therefore be compressed into 12 ICs; these 12 ICs are presented in Figure 4 of
Su et al. (2013).
To confirm the reliability and quality of the ICs, we used the estimated 12 ICs to recover the 1547
spectra of the CB07 SSP database as follows:

$$f^{i}_{\mathrm{SSP}}(\lambda) = \sum_{k=1}^{12} w^{i}_{k}\, \mathrm{IC}_{k}(\lambda) \quad (i = 1, 2, \ldots, 1547). \qquad (9)$$

We found that the spectra reconstructed from these 12 ICs match the SSP
database excellently.

2.3 Fitting Galaxy Spectra 

The aim of this study is to use these 12 estimated ICs to fit galaxy spectra from large surveys. The SFH
of a galaxy can be approximated as a combination of discrete bursts, so the stellar population of a galaxy can be
decomposed into a combination of SSPs. As shown in Section 2.2.3, the SSP database can be compressed
into 12 ICs, so the model of an observed galaxy spectrum, $f_{g}(\lambda)$, can be written in terms of these 12 ICs as:

$$f_{g}(\lambda) = r(\lambda) \sum_{k=1}^{12} a_{k}\, \mathrm{IC}_{k}(\lambda, \sigma), \qquad (10)$$

where $r(\lambda)$ is the reddening term, which describes the intrinsic starlight reddening and is modeled by the
extinction law of Charlot & Fall (2000), and $\mathrm{IC}_{k}(\lambda, \sigma)$ is the $k$-th IC convolved with a Gaussian
function. The Gaussian width $\sigma$ corresponds to the stellar velocity dispersion of the galaxy. During the
fitting process, we mask points around the prominent lines, such as the Balmer lines (Hε, Hγ, Hδ, Hβ, Hα)
and strong forbidden lines ([O II] λ3727, [Ne III] λ3869, [O III] λλ4959, 5007, He I λ5876, [O I] λ6300,
[N II] λλ6548, 6584, and [S II] λλ6717, 6731).

After subtracting the modeled stellar population spectrum, the emission lines are fitted with
Gaussians simultaneously, in a similar way to Tremonti et al. (2004): the forbidden lines ([O II], [O III], [O I],
[N II] and [S II]) are set to have the same line width and velocity offset, and likewise for the Balmer lines (Hγ,
Hδ, Hβ and Hα). With the procedures above, the observed galaxy spectra can be recovered quickly.
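
The line masking applied during the continuum fit can be sketched as follows. The forbidden and He I line wavelengths are those quoted in the text, and the Balmer wavelengths are standard values rounded to 1 Å. The mask half-width of 10 Å and the names `MASKED_LINES` and `continuum_mask` are assumptions made for illustration, since the text does not state how wide the masked windows are.

```python
# Rest wavelengths in Angstrom. The forbidden and He I lines are those listed in
# the text; the Balmer values are standard wavelengths rounded to 1 Angstrom.
MASKED_LINES = {
    "H-epsilon": 3970.0, "H-delta": 4102.0, "H-gamma": 4340.0,
    "H-beta": 4861.0, "H-alpha": 6563.0,
    "[O II]": 3727.0, "[Ne III]": 3869.0,
    "[O III] 4959": 4959.0, "[O III] 5007": 5007.0,
    "He I": 5876.0, "[O I]": 6300.0,
    "[N II] 6548": 6548.0, "[N II] 6584": 6584.0,
    "[S II] 6717": 6717.0, "[S II] 6731": 6731.0,
}

def continuum_mask(wavelengths, half_width=10.0):
    """Return True for pixels kept in the fit, False for pixels near a line.

    half_width (Angstrom) is an assumed value chosen only for illustration.
    """
    centers = list(MASKED_LINES.values())
    return [all(abs(w - c) > half_width for c in centers) for w in wavelengths]

rest_wavelengths = [4850.0, 4861.0, 4900.0, 5000.0, 5500.0, 6560.0]
keep = continuum_mask(rest_wavelengths)
```

Only pixels flagged `True` would enter the continuum fit of Eq. (10); the masked pixels are recovered later when the emission lines are fitted on the starlight-subtracted spectrum.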

3 RELIABILITY OF THE FITTING METHOD 
3.1 Simulations 

In this section, we analyze simulated galaxy spectra to examine the reliability of the MF-ICA
method. All simulated spectra are generated with the 2007 version of the BC03 stellar population synthesis
code. For the sake of simplicity, we parameterize the SFH of each simulated galaxy in terms of an underlying
continuous model with random bursts superimposed on it (Kauffmann et al. 2003). The spectral energy
distribution (SED) at time $t$ of a stellar population characterized by an exponentially declining star
formation law $\psi(t) \propto \exp(-\gamma t)$ is given by:

$$F_{\lambda}(t) = \int_{0}^{t} \psi(t - t')\, S_{\lambda}(t', Z)\, dt', \qquad (11)$$

where $S_{\lambda}(t', Z)$ is the power radiated by an SSP of age $t'$ and metallicity $Z$ per unit wavelength per unit
initial mass.
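
The convolution in Eq. (11) can be evaluated numerically at a single wavelength, as in the sketch below. It assumes the star formation law is normalized as $\psi(t) = \exp(-\gamma t)$ and takes the SSP flux as a user-supplied function; `ssp_flux` is a hypothetical stand-in for interpolation in a tabulated SSP library, and the trapezoidal rule is simply one convenient choice of quadrature.

```python
import math

def sed_at_time(t, gamma, ssp_flux, n_steps=2000):
    """Trapezoidal estimate of Eq. (11) at one wavelength.

    t        : time since the onset of star formation (Gyr)
    gamma    : decline rate of psi(t) = exp(-gamma * t) (Gyr^-1)
    ssp_flux : callable giving S_lambda(t') for an SSP of age t'
               (a hypothetical stand-in for a tabulated SSP library)
    """
    step = t / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        age = i * step                               # SSP age t'
        weight = 0.5 if i in (0, n_steps) else 1.0   # trapezoid end-point weights
        total += weight * math.exp(-gamma * (t - age)) * ssp_flux(age)
    return total * step
```

For an SSP whose flux does not change with age, the integral has the closed form $(1 - e^{-\gamma t})/\gamma$, which gives a quick check of the numerical result.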

The adopted SFHs are described below:

1. The time $t$ at which a galaxy begins forming stars is distributed uniformly between 0.1 and 13.5 Gyr.
The star formation timescale parameter $\gamma$ is uniformly distributed between 0 and 1 Gyr$^{-1}$.

2. Random bursts occur at any time with equal probability. Bursts are parameterized in terms of the
fraction of stellar mass produced, which is logarithmically distributed between 0.03 and 4, and their
duration varies between 0.03 and 0.3 Gyr.

3. The metallicities $Z$ are uniformly distributed between 0.02 $Z_{\odot}$ and 2 $Z_{\odot}$, which represents the range
of stellar metallicities inferred from the spectra of $\sim 2 \times 10^{5}$ SDSS galaxies.

We apply our spectral analysis method to 500 simulated spectra over the range 3322–9200 Å. We
also use the extinction law of Charlot & Fall (2000) to attenuate each spectrum, where the absorption
optical depth $\tau_{V}$ is uniformly distributed between 0 and 5. The velocity dispersion $\sigma$ is distributed
uniformly between 50 km s$^{-1}$ and 450 km s$^{-1}$. Finally, we add Gaussian noise with signal-to-noise
ratios S/N = 10, 20 and 30, respectively.
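
The sampling of simulation parameters described above can be sketched as follows. The ranges are those quoted in the text; the dictionary keys, the function names, and the convention that the noise standard deviation in each pixel equals flux divided by S/N are assumptions made for illustration. Generating the spectra themselves requires the population synthesis code and is not shown.

```python
import math
import random

def draw_simulation_parameters(rng):
    """Draw one parameter set from the ranges quoted in the text."""
    return {
        "t_form_gyr": rng.uniform(0.1, 13.5),        # onset of star formation
        "gamma_per_gyr": rng.uniform(0.0, 1.0),      # exponential decline rate
        # logarithmically distributed burst mass fraction between 0.03 and 4
        "burst_fraction": 10 ** rng.uniform(math.log10(0.03), math.log10(4.0)),
        "burst_duration_gyr": rng.uniform(0.03, 0.3),
        "metallicity_solar": rng.uniform(0.02, 2.0),
        "tau_v": rng.uniform(0.0, 5.0),              # absorption optical depth
        "sigma_kms": rng.uniform(50.0, 450.0),       # velocity dispersion
    }

def add_gaussian_noise(flux, snr, rng):
    """Add zero-mean Gaussian noise with per-pixel sigma = flux / snr (an assumed
    convention; the text only states the target S/N)."""
    return [f + rng.gauss(0.0, abs(f) / snr) for f in flux]

rng = random.Random(42)
params = [draw_simulation_parameters(rng) for _ in range(500)]
noisy = add_gaussian_noise([1.0] * 1000, snr=20, rng=rng)
```

Fixing the random seed, as done here, makes a simulated sample reproducible, which is convenient when comparing fitting methods on identical inputs.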

3.2 Results 

From the fitting of the simulated spectra, we examine the reliability of our spectral analysis
approach, which is based on the MF-ICA algorithm. Our main parameters of interest are $A_{V}$, $\sigma$, $t$ and $Z$.
The following steps are used to estimate age and metallicity:

1. The pure stellar spectrum of a galaxy, $f_{g}(\lambda)$, can be recovered by the ICs, and it can also be
represented by a combination of $N$ SSPs. Thus we can solve the equation:

$$f_{g}(\lambda) = \sum_{k=1}^{12} a_{k}\, \mathrm{IC}_{k}(\lambda) = \sum_{j=1}^{N} b_{j}\, f^{j}_{\mathrm{SSP}}(\lambda). \qquad (12)$$

2. We adopt 60 SSPs from the CB07 database, comprising models of 15 different ages (t = 0.001, 0.003,
0.005, 0.01, 0.025, 0.04, 0.1, 0.2, 0.6, 0.9, 1.4, 2.5, 5, 11, 13 Gyr) and 4 different metallicities
(Z = 0.004, 0.008, 0.02, 0.05).

3. After solving Eq. (12), the age and metallicity are obtained from:

$$\langle \log t \rangle_{L} = \sum_{j=1}^{60} b_{j} \log(t_{j}), \qquad (13)$$

$$\log \langle Z \rangle_{L} = \log \sum_{j=1}^{60} b_{j} Z_{j}. \qquad (14)$$
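
The light-weighted age and metallicity of Eqs. (13) and (14) reduce to two weighted sums once the SSP weights are known. The sketch below assumes the weights are normalized to unit sum and that ages are given in years; neither convention is stated in the text, and the function name and toy inputs are hypothetical.

```python
import math

def light_weighted_age_and_metallicity(weights, ages_yr, metallicities):
    """Evaluate Eqs. (13) and (14) for SSP weights b_j.

    The weights are normalized to sum to one here, which is an assumption
    about the convention; the text does not state the normalization.
    """
    total = sum(weights)
    b = [w / total for w in weights]
    mean_log_age = sum(bj * math.log10(tj) for bj, tj in zip(b, ages_yr))
    log_mean_z = math.log10(sum(bj * zj for bj, zj in zip(b, metallicities)))
    return mean_log_age, log_mean_z

# Toy example with three SSPs (illustrative weights, ages in yr).
log_t, log_z = light_weighted_age_and_metallicity(
    weights=[0.2, 0.3, 0.5],
    ages_yr=[1.0e8, 1.0e9, 1.0e10],
    metallicities=[0.02, 0.02, 0.02],
)
```

Note the asymmetry between the two definitions: the age is an average of logarithms, while the metallicity is the logarithm of an average, so the two quantities respond differently to a small fraction of extreme components.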


Figure 1 shows the input parameters against the values estimated from the simulated spectra with S/N = 10,
20 and 30. Clearly, the starlight reddening $A_{V}$ and the stellar velocity dispersion $\sigma$ are well
recovered. The mean square errors (MSE) between recovered and input values are less than 0.20 and
7.45, respectively, as shown in Table 1.

With the above method, a galaxy spectrum can be decomposed into 60 SSPs with weights. The
estimated weights $b_{j}$ reflect the fractional contribution of the $j$-th SSP, with age $t_{j}$ and metallicity
$Z_{j}$. Therefore, the light-weighted age and metallicity can be estimated. As shown in Figure 1 (bottom
panel), the recovered and input values of $\langle \log t \rangle_{L}$ show no significant difference, with an MSE of less than
0.20. Owing to the age-metallicity degeneracy (Bressan et al. 1996), the values of $\log \langle Z \rangle_{L}$
recovered by our method cannot be exactly accurate. However, we can estimate them with meaningful
accuracy: the Spearman rank correlation coefficient between output and input $\log \langle Z \rangle_{L}$ is about 0.70
for S/N = 20. A summary of the mean square errors of the parameters from the simulated spectra can be
found in Table 1.


3.3 Comparison with the EL-ICA method

To test the influence of the ICA algorithm, we re-estimated the ICs with the EL-ICA algorithm
(Miskin & MacKay 2001). The EL-ICA method, which is also known as the naive mean field ICA method,
has been applied to galaxy spectrum analysis by Lu et al. (2006). Here we followed the same steps as Lu
et al. (2006) and also derived six ICs, which are presented in Figure 2.
We use these ICs to refit the simulated spectra. The input parameters against the estimated output values,
and the mean square errors between them, for simulations with S/N = 20 are shown in Figure 3. The
dispersions of the parameters derived by the EL-ICA method are larger than those from our method, which are
shown in Figure 1. The mean square errors of the starlight reddening, velocity dispersion, stellar age and
metallicity ($A_{V}$, $\sigma$, $\langle \log t \rangle_{L}$, $\log \langle Z \rangle_{L}$) are 0.421, 33.841, 0.405 and 0.299 for S/N = 20, respectively.
These values are much larger than those of our method, as shown in Table 1. Finally, we also fit the SSP
database using these ICs, and the recovered spectra are not as good as those from our method. We conclude
that our method, which is based on the MF-ICA algorithm, is more precise and reliable.

4 APPLICATION TO SDSS SPECTRA 

Using our MF-ICA spectral analysis method, we fit the SDSS galaxy spectra, analyze their stellar population
properties, and measure their emission-line properties from the starlight-subtracted spectra. In
this section, we compare the physical parameters obtained from the stellar population analysis of the continuum
with those obtained from measurements of the emission lines. We also compare the parameters estimated with our fitting technique
with those derived by the MPA/JHU group. Because the aim of this section is to test whether the results
from our spectral analysis method are reasonable and meaningful, we do not investigate the physical
properties of the galaxies, such as their formation and evolution.

4.1 Data preparation 

The Sloan Digital Sky Survey (SDSS; York et al. 2000) has released a huge number of high-quality
observed spectra. In this work, our spectral sample was extracted from the spectroscopic plates
of SDSS Data Release 8 (DR8; Aihara et al. 2011). We select the objects which have been
spectroscopically classified as galaxies. The SDSS spectra span the wavelength range from 3800 Å
to 9200 Å with a mean spectral resolution $R = \lambda/\Delta\lambda \sim 1800$, and are taken through fibers of three arcsecond
diameter. In total, we fit a sample of about one million galaxy spectra with redshifts less than 1, as
obtained from the SDSS spectroscopic pipeline.

We use the MF-ICA method described in Section 2.3 to fit the SDSS galaxy spectra.
First, the spectra were corrected for foreground Galactic extinction
using the maps of Schlegel et al. (1998). They were then transformed into the rest frame using the spectroscopic
redshifts. The spectral fits give a median $\chi^{2}$/d.o.f. (degrees of freedom) of 1.13, close
to the ideal value of 1, as expected. Figure 4 shows some examples of the fits; visual inspection
shows that the spectra are well recovered, which suggests that our MF-ICA spectral analysis approach works
well.
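
Two of the steps above, the shift to the rest frame and the goodness-of-fit statistic, are simple enough to sketch directly. The function names are hypothetical. The number of free parameters is left as an input because the text does not state how the degrees of freedom were counted; for the model of Eq. (10) it would presumably include the 12 IC weights plus the reddening and velocity dispersion.

```python
def to_rest_frame(observed_wavelengths, redshift):
    """Shift observed wavelengths to the rest frame: lambda_rest = lambda_obs / (1 + z)."""
    return [w / (1.0 + redshift) for w in observed_wavelengths]

def reduced_chi_square(observed, model, errors, n_free_parameters, keep=None):
    """chi^2 per degree of freedom over the unmasked pixels.

    keep is an optional list of booleans (True = pixel used in the fit).
    """
    if keep is None:
        keep = [True] * len(observed)
    chi2 = 0.0
    n_used = 0
    for o, m, e, k in zip(observed, model, errors, keep):
        if k:
            chi2 += ((o - m) / e) ** 2
            n_used += 1
    return chi2 / (n_used - n_free_parameters)
```

A value near 1 indicates that the residuals are comparable to the quoted flux uncertainties, which is the sense in which the median of 1.13 is described as close to ideal.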

4.2 Comparisons with the MPA/JHU database 

The MPA/JHU group has made catalogs of estimated physical parameters of SDSS galaxies publicly
available on their website$^{3}$. They inferred the SFHs of DR8 galaxies on the basis of the CB07 models, which
is similar to our work. Here we compare our own estimated parameters, such as the emission-line
measurements and stellar population properties, with those obtained from the MPA/JHU catalogs.
Although we do not expect our estimated parameters to be perfectly consistent with theirs, we analyze the
relationships between these parameters to examine the accuracy and reliability of the MF-ICA algorithm.

4.2.1 Stellar extinction 

In our fitting technique, the extinction of optical galaxy spectra is modeled with one free parameter, $A_{V}$. In
Figure 5(a), we plot the values of $A_{V}$ estimated by our method against those estimated by the MPA/JHU
group, who adopt the same attenuation curve of Charlot & Fall (2000). Since only a few galaxies with
$A_{V} < 0$ have been found in previous work, we constrain $A_{V}$ to be positive in our analysis;
this constraint should not have a significant impact on our results.

We adopt Spearman's rank correlation coefficient to describe the relationship between
two variables. As shown in Figure 5(a), our results are well and linearly correlated with those
extracted from the MPA/JHU catalogs, with $r_{s} = 0.69$. However, the extinction values $A_{V}$ obtained
from the MPA/JHU database are systematically lower than our values, similar to what was found by Chen et al.
(2012, their Figure 3f). One possible reason for this discrepancy is that we only use the optical-band spectra
to estimate the stellar extinction $A_{V}$.

$^{3}$ See http://www.sdss3.org/dr8/spectro/spectro_access.php


4.2.2 Stellar mass 

With our stellar population analysis method, the light-weighted stellar mass $\log(M)_{L}$ of SDSS
galaxies can also be estimated. We calculate the M/L ratio as the weighted sum of the M/L ratios of the SSP
components, and then derive the stellar mass by multiplying it by the luminosity. In Figure 5(b), we plot our estimated
stellar mass against the MPA/JHU extinction-corrected stellar mass. The results from the two methods are
consistent, with $r_{s} = 0.90$. The small discrepancy is due to the different estimation methods.
In our method, the stellar masses are obtained from the M/L ratio estimated from the best-fitting
model, whereas the MPA/JHU group estimated their M/L ratio through a Bayesian inference method,
which compares two indices, H$\delta_{A}$ and $D_{n}(4000)$, with a model obtained from a large library of Monte
Carlo realizations of galaxies with different SFHs (Kauffmann et al. 2003).
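
The mass estimate described above amounts to a weighted mean of SSP mass-to-light ratios multiplied by the galaxy luminosity. The sketch below assumes the weights are light fractions normalized to unit sum and that all quantities are in solar units in a common band; these conventions, the function name and the input values are illustrative assumptions rather than details given in the text.

```python
import math

def stellar_mass_from_ml(weights, ml_ratios, luminosity):
    """Stellar mass = (weighted mean M/L of the SSP components) x luminosity.

    weights     : SSP light fractions b_j (normalized here; an assumed convention)
    ml_ratios   : M/L of each SSP in solar units (hypothetical input values)
    luminosity  : galaxy luminosity in solar units in the same band
    """
    total = sum(weights)
    mean_ml = sum(w * ml for w, ml in zip(weights, ml_ratios)) / total
    return mean_ml * luminosity

mass = stellar_mass_from_ml([0.25, 0.75], [0.5, 4.0], luminosity=1.0e10)
log_mass = math.log10(mass)
```

Because old populations have much higher M/L than young ones, the derived mass is sensitive to the weight assigned to the oldest SSPs, which is one way in which different fitting methods can produce small systematic offsets.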


4.2.3 Emission lines and nebular metallicities 

In our case, the emission lines are measured from the starlight-subtracted spectrum. The MPA/JHU group
adopted a similar method; however, they used only a single-metallicity CB07 model to fit the observed
continuum. We plot our estimated equivalent widths (EWs) of emission lines, such as Hβ, [O III] λ5007,
[O I] λ6300, Hα, [N II] λ6584 and [S II] λ6717, against those measured by the MPA/JHU group in Figure 6.
As shown in Figure 6, our values are consistent with those measured by the MPA/JHU group, with
small discrepancies. We adopt the MSE to quantify the discrepancy between them. On the whole, the
MSEs of all these values are less than 1, suggesting that there is no significant difference. The small
discrepancies are due to differences in the synthesized spectrum, which lead to
different subtracted stellar absorption.

We also compared the values of the nebular oxygen abundance 12 + log(O/H), which can be obtained
from the equation described in Tremonti et al. (2004). As shown in Figure 7, our estimated values of the
nebular oxygen abundance show a high degree of correspondence with those drawn from the MPA/JHU
catalog. The Spearman rank coefficient is 0.99, close to the value
$r_{s} = 1$ that indicates a perfectly monotonic relationship.

This part is summarized as follows. We have compared estimated parameters, such as the stellar
extinction, stellar mass and emission-line measurements, with those obtained from the MPA/JHU catalogs.
From the analysis of the relationships between these parameters, we conclude that the MF-ICA method
is reasonable and reliable.


4.3 Empirical relations 

In this subsection, we test the accuracy of our method in another way. The parameters estimated
from the analysis of the continuum are compared with those estimated from the measured emission lines.
We analyze the relationships between these parameters to investigate whether the results derived by our method
are reasonable.
4.3.1 Relations between Balmer features and stellar age

The 4000 Å break index $D_{n}(4000)$ (Balogh et al. 1999) reflects the age of a galaxy. Higher $D_{n}(4000)$
values correspond to older, metal-rich galaxies, while lower values correspond to younger stellar
populations. The strength of H$\delta_{A}$ (Worthey & Ottaviani 1997) is another age indicator. Strong H$\delta_{A}$
absorption in a galaxy reflects a burst of star formation that occurred in the past 0–1 Gyr. Therefore, our
estimated galaxy ages should increase with $D_{n}(4000)$ and decrease with H$\delta_{A}$. In
Figure 8(a) and (b), the correlations between age and the $D_{n}(4000)$ and H$\delta_{A}$ values show the expected
relationships: the $D_{n}(4000)$-$\langle \log t \rangle_{L}$ trend is strongly positive, with $r_{s} = 0.85$, and the H$\delta_{A}$-$\langle \log t \rangle_{L}$ trend
is strongly negative, with $r_{s} = -0.77$.

For galaxies with emission lines, the Hα emission line traces the instantaneous star
formation rate (SFR) of a galaxy (Kennicutt & Evans 2012). The equivalent width (EW) of
Hα is therefore also an age indicator, which is larger for younger galaxies. Figure 8(c) shows that the
light-weighted stellar age $\langle \log t \rangle_{L}$ correlates negatively with EW(Hα) ($r_{s} = -0.79$), as expected. These
relations indicate that the stellar ages $\langle \log t \rangle_{L}$ obtained from our spectral synthesis are reasonable.

4.3.2 Stellar mass and velocity dispersion 

According to the virial theorem, for constant mass surface density, the stellar mass ($\log M_{*}$) is expected
to correlate positively with the stellar velocity dispersion ($\log \sigma$). In samples of old galaxies, $\sigma$ is
related to galaxy mass through the Faber-Jackson relation. Moreover, the stellar velocity dispersion of
young, star-forming galaxies has contributions from both bulge and disc, and is therefore related to galaxy mass through
the Tully-Fisher relation. In Figure 8(d), we plot our estimates of stellar mass ($\log M_{*}$) against velocity
dispersion ($\log \sigma$). The $M_{*}$-$\sigma$ relation is strongly positive, with $r_{s} = 0.82$, as expected, which
suggests that our synthesis results are meaningful.

We have analyzed the correlations between physical parameters obtained from stellar populations,
such as stellar ages and stellar masses, and independent quantities. The strong correlations of $\langle \log t \rangle_{L}$ with
$D_{n}(4000)$, H$\delta_{A}$ and EW(Hα), and of $M_{*}$ with $\sigma$, suggest that the parameters derived by our spectral synthesis
approach through the MF-ICA method are reasonable and meaningful.


5 APPLICATION TO SPECTRA OF GALAXIES WITH HIGHER REDSHIFT 


Optical galaxy redshift surveys are not only vital importance in cosmology, but also very impo rtant to 
understand physical processes related to galaxy formation and/or e volution(|Fang_eLalJ, 2015h. In the 
last few years, redshift surveys, such as 2dF Galaxy Redshift Survey (l2dFGRS: Colless et al. . 200 ih and 
SDSS, have measured redshifts of millions of low redshift (with median redshift of z = O.I) galax¬ 
ies. With larger aperture telescope, new generation redshift survey s, such as the DEEP2, BigBOSS, 
LAMOST, will measure redshifts of gala xies with higher redshift dPavis et al.L l2007t ISchlegel et al.[ 
I 2 OO 9 I: iKong & l2010t IZou et al.l |201 ll) . The motivation for this work is that we want to provide an 

easy-to-use full-spect rum fitting package and determine spectral parameters for spectra of the LAMOST 
extragalactic surveys dKong & Sull2010h . Since the regular spectroscopic survey of LAMOST just be¬ 
ginning, we apply our synthesis approach to the spectra of galaxy from the DEEP2 suryey, which has a 
similar signal-to-noise ratio as the spectrum of LAMOST dLuo et al.Ll2015h . 

In the Extended Groth Strip (EGS) field, using DEIMOS (the DEep Imaging Multi-Object Spectrograph) 
mounted on the Keck 10 m telescope, the DEEP2 galaxy redshift survey provides spectral data 
for galaxies with redshifts from 0 to 1.4. With its highest-resolution grating (1200 line mm^-1), DEIMOS 
typically covers 6500 - 9100 Å at a spectral resolution R = λ/Δλ ≈ 6000 (Faber et al. 2003). In 
our study, we only analyze galaxies with redshift quality Q > 3. This gives 9501 galaxies with 
Q > 3 in the EGS, with a median redshift of 0.74. The details of the spectral extraction for these 
galaxies can be found in Davis et al. (2007). Finally, we obtained about 1,400 sources with S/N > 3, 
and show some examples of the fitting in Figure 9. It can be seen that our MF-ICA fitting method works 
well; we will analyze the physical properties of these galaxies in future work. 
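As a rough illustration of what the instrument numbers above imply, the following sketch converts the quoted resolving power into a wavelength resolution element and maps the observed-frame coverage to the rest frame at the median redshift. The mid-band wavelength of 7800 Å and the function names are assumptions chosen for this example; they do not come from the DEEP2 pipeline.

```python
# Illustrative arithmetic only: values of R, the wavelength coverage and the
# median redshift are taken from the text; 7800 A is an assumed mid-band point.

def resolution_element(wavelength_angstrom, resolving_power):
    """Return delta-lambda (in Angstrom) from R = lambda / delta-lambda."""
    return wavelength_angstrom / resolving_power


def rest_frame_window(obs_min, obs_max, redshift):
    """Convert an observed-frame wavelength window to the rest frame."""
    factor = 1.0 + redshift
    return obs_min / factor, obs_max / factor


R = 6000.0                        # spectral resolution quoted in the text
obs_min, obs_max = 6500.0, 9100.0  # observed-frame coverage in Angstrom
z_median = 0.74                   # median redshift of the Q > 3 sample

delta_lambda = resolution_element(7800.0, R)
rest_min, rest_max = rest_frame_window(obs_min, obs_max, z_median)

print(f"Resolution element near 7800 A: {delta_lambda:.2f} A")
print(f"Rest-frame coverage at z = {z_median}: {rest_min:.0f} - {rest_max:.0f} A")
```

Under these assumptions the resolution element is about 1.3 Å, and a galaxy at the median redshift is sampled over roughly 3740 - 5230 Å in the rest frame.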
6 SUMMARY 

In this work, we have presented the MF-ICA method to compress the CB07 SSP library into a few 
Independent Components (ICs), which serve as templates to fit observed galaxy spectra. Although there 
are many statistical multivariate data processing techniques, MF-ICA seems to be more useful, since it 
can provide good estimates of the results when proper parameters are selected. The goal of our 
project was to estimate physical properties quickly and accurately for a large sample of galaxies. 
Using the MF-ICA algorithm, we can fit an observed galaxy spectrum in only a few seconds, which is 
time-efficient for the analysis of galaxy spectra observed by large-area surveys such as LAMOST and BigBOSS. 
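The idea of fitting a spectrum with a handful of templates can be sketched as a linear least-squares problem. The example below is not the MF-ICA algorithm and not the authors' code: the "ICs" are made-up Gaussian bumps, the number of templates (6) is arbitrary, and reddening and velocity dispersion are ignored. It only shows why a fit with a few templates is cheap.

```python
import numpy as np

# Synthetic setup (all values are assumptions for illustration).
rng = np.random.default_rng(0)
n_pix, n_ic = 200, 6
wave = np.linspace(3800.0, 7000.0, n_pix)      # wavelength grid in Angstrom
centers = np.linspace(3800.0, 7000.0, n_ic)    # centres of the fake templates

# Each column is one hypothetical template spectrum, shape (n_pix, n_ic).
templates = np.exp(-((wave[:, None] - centers[None, :]) / 400.0) ** 2)

# Build a mock "observed" spectrum as a noisy linear combination.
true_weights = np.array([0.5, 1.0, 0.2, 0.8, 0.3, 0.6])
noise_sigma = 0.01
observed = templates @ true_weights + rng.normal(0.0, noise_sigma, n_pix)

# Ordinary least squares: find weights minimising |observed - templates @ w|^2.
weights, _, rank, _ = np.linalg.lstsq(templates, observed, rcond=None)

model = templates @ weights
residual = observed - model
rms = float(np.sqrt(np.mean(residual ** 2)))

print("fitted weights:", np.round(weights, 3))
print("residual rms:", round(rms, 4))
```

With only a few columns in the template matrix, the solve involves a very small linear system, which is one plausible reason a compressed template set fits faster than a full SSP library.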

We have tested our method by fitting simulated spectra and SDSS DR8 galaxy spectra. Simulations 
show that the important parameters of galaxies, such as stellar content, star formation history, 
starlight reddening and stellar velocity dispersion, can be accurately recovered by our method. 
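Parameter recovery of this kind is quantified in Table 1 by the mean square error (MSE) between input and output values. A minimal sketch of that statistic follows, assuming the conventional definition (the mean of the squared differences); the exact normalization used for Table 1 is not stated in this section, and the A_V values below are invented for illustration.

```python
def mean_square_error(inputs, outputs):
    """Mean of squared differences between input and recovered values."""
    if len(inputs) != len(outputs) or not inputs:
        raise ValueError("inputs and outputs must be non-empty and equal in length")
    return sum((out - inp) ** 2 for inp, out in zip(inputs, outputs)) / len(inputs)


# Hypothetical input and recovered extinctions (mag), not values from the paper.
input_av = [0.2, 0.5, 1.0, 1.5]
output_av = [0.3, 0.4, 1.1, 1.3]

mse_av = mean_square_error(input_av, output_av)
print(f"MSE of A_V: {mse_av:.4f}")
```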

We have compared parameters estimated from our fitting technique with those obtained by the 
MPA/JHU group for DR8 galaxies. These physical parameters and measurements are in good agreement. 
We have also analyzed the correlations between physical parameters obtained from stellar populations 
and independent quantities. We find strong correlations of M_* with σ, and of ⟨log t⟩_L with D_n(4000), 
Hδ_A and EW(Hα). 
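The strength of such correlations is reported in the figures as a Spearman rank correlation coefficient. The sketch below computes it in plain Python as the Pearson correlation of the ranks, with average ranks for ties. The log M_* and log σ values are made up for illustration, and the helper names are assumptions of this example.

```python
def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values, not measurements from the paper.
log_mass = [9.5, 10.0, 10.4, 10.9, 11.2]
log_sigma = [1.80, 1.95, 2.10, 2.05, 2.40]

rho = spearman(log_mass, log_sigma)
print(f"Spearman coefficient: {rho:.2f}")
```

Because it depends only on ranks, the coefficient is insensitive to any monotonic rescaling of either quantity, which suits relations that are not expected to be linear.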

In future studies, we intend to apply our fitting technique to other large databases, such as the 
LAMOST ExtraGAlactic Surveys (LEGAS) and the DEEP2 galaxy redshift survey. We have fitted more 
than 1,400 DEEP2 galaxy spectra; our next step will be to analyze their physical properties. 
Table 1 Summary of Parameter Error Estimates for Simulated 
Spectra. The different rows list the mean square error (MSE) 
between output and input values of the corresponding quantity, 
as obtained from simulations with different signal-to-noise ratios 
(S/N). 

S/N    MSE_AV (mag)    MSE_σ (km s^-1)    MSE(log    MSElog(z) 
10     0.191           7.449              0.201      0.201 
20     0.169           6.301              0.189 
30     0.119           6.017              0.169      0.196 
Fig. 1 Comparison of the input A_V (mag), σ (km s^-1) and stellar ages (yr) with the 
output estimated values for simulations, with S/N = 10, 20 and 30, using our MF-ICA method. 
The red dot-dashed line is the identity line (y = x). 
Fig. 2 The spectra of 6 ICs estimated by the EL-ICA method. Some prominent spectral features 
are labeled, as in Figure 4 of Su et al. (2013). 




Fig. 3 Comparison of the input A_V (mag), σ (km s^-1) and stellar ages (yr) with the 
output estimated values for simulations, with S/N = 20, using the EL-ICA method. The red 
dot-dashed line is the identity line (y = x). 
Fig. 4 The spectral fitting results of some galaxies in our SDSS DR8 sample at a range of 
redshifts. The black line shows the observed spectrum, the red line shows the modelled stellar 
spectrum, the grey line shows the residual spectrum, and the redshift is labeled in the top left 
corner of each panel. 
Fig. 5 The values of stellar extinction A_V and stellar mass M_* computed from the MPA/JHU 
database versus those values computed by our code. The dot-dashed line is the identity line 
(y = x). The dashed line in the right panel is a robust fit for the relation. The number in the 
top left corner is the Spearman rank correlation coefficient. 
Fig. 6 The comparison of equivalent widths of Hβ, [O III]λ5007, [O I]λ6300, Hα, 
[N II]λ6584 and [S II]λ6717 measured by the MPA/JHU group with those by our code. The 
red dot-dashed line is the identity line (y = x), while the number in the bottom-right corner 
of each panel indicates the mean square error (MSE). 
Fig. 7 Plot of our estimated nebular oxygen abundances against those obtained by the 
MPA/JHU group. The dot-dashed line is the identity line (y = x), while the number in the 
top left corner is the Spearman rank correlation coefficient. 
Fig. 8 Relations of the 4000 Å break index versus the light-weighted mean stellar age (a); 
the Hδ_A index versus the light-weighted mean stellar age (b); the equivalent width of Hα 
versus the light-weighted mean stellar age (c); the comparison of our estimated stellar mass 
(log M_*) with velocity dispersion (log σ) (d). The number in the top left corner of each panel 
is the Spearman rank correlation coefficient r_s. 
Fig. 9 The spectral fitting results of some galaxies in our DEEP2 sample at a range of redshifts, 
which are labeled in the top left corner of each panel. The black line shows the observed 
spectrum, the red line shows the modelled stellar spectrum, and the grey line shows the residual 
spectrum. We also mask the "telluric absorption" regions between dashed lines (observed frame: 
7750 - 7700 Å and 6850 - 6900 Å). 






